SLIP
Technology Browser Exercise I
November 15, 2001
Obtaining Informational Transparency with Selective Attention
Dr. Paul S. Prueitt
President, OntologyStream Inc
{ Eventname, d_port }
One needs two WinZip files, vSLIP and wSLIP.
An analytic conjecture was developed that linked defender ports with a non-specific relationship. A RealSecure summary intrusion event database was used. We used 14,475 records from April 15, 2001. The RealSecure columns are:
{ record, ename, protocol, s_port, d_port, s_addname, d_addname, epriority }
s_addname is the IP address of the source and d_addname is the IP of the defender.
Let us review what an analytic conjecture is.
Figure 1: The simplest form of an analytic conjecture
Formally we have:
( a1 , b ) + ( a2 , b ) → < a1 , r , a2 >
where r is the non-specific relationship. The “b” values are from one column in the intrusion event log and the “a” values are from a second column in the intrusion event log. We call “b” the “first name” and “a” the “second name”. The set { a } defines the set of atoms that are categorized. The set { b } provides a means to define the incident-level events that result from the emergent computing technique that we invented. Incident event descriptions can be automatically constructed using emergent computing and a reordering of the values of a derived Report. This fact was discovered on November 15th, 2001 (see the following example).
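The pairing rule above can be sketched in Python. This is a minimal illustration, assuming the log has already been reduced to (a, b) tuples. The function name conjecture_pairs and the sample rows are hypothetical, and the text does not say whether SLIP stores each pair once or in both orders; here each unordered pair is stored once.

```python
from collections import defaultdict
from itertools import combinations

def conjecture_pairs(rows):
    """Return the set of unordered pairs (a1, a2) that share a b value.

    rows: iterable of (a, b) tuples, for example (d_port, ename).
    Each returned pair (a1, a2) stands for < a1, r, a2 >.
    """
    atoms_by_b = defaultdict(set)
    for a, b in rows:
        atoms_by_b[b].add(a)
    pairs = set()
    for atoms in atoms_by_b.values():
        # Every two distinct atoms seen with the same b are related by r.
        for a1, a2 in combinations(sorted(atoms), 2):
            pairs.add((a1, a2))
    return pairs

# Illustrative rows only, not taken from the RealSecure log.
rows = [(80, "EventX"), (443, "EventX"), (443, "EventY"), (25, "EventY"), (53, "EventZ")]
pairs = conjecture_pairs(rows)
print(sorted(pairs))
```

An atom that never shares a b value with another atom (53 in the sample rows) produces no pair, which is why the set of atoms that actually cluster can be smaller than the set of unique column values.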
The Example:
Let the first column, (b), be the RealSecure Intrusion Event designation (ename) and the second column, (a), be the d_port. The unique elements of the second column of the RealSecure events will be a superset of the set of atoms for this analytic conjecture. There are 602 atoms in the top-level category A1 of the associated SLIP Framework.
There are 49 unique ename values in the event log. These are the b values that can be used in the development of event maps.
The set of paired d_port values contains 47,780 pairs, each member of a pair being a port value. The pairs are defined through the analytic conjecture graph, Figure 1.
Start SLIP.exe in a folder that contains a subfolder named “data”. You need only two text files to start with.
1) Paired.txt is the file containing the 47,780 pairs of port values.
2) Datawh.txt (Data Warehouse) is the file containing the 14,475 RealSecure summary event records.
These two files are memory mapped and then searched using new algorithms invented for this purpose. Paired.txt is searched several hundred thousand times to produce the clustering. Datawh.txt is searched up to a few hundred times in order to produce a report from a cluster of atoms. The report contains all of the intrusion event log records that have a value equal to one of the atoms in the category. When the Report is ordered by first name, the structure of incident-level events is revealed.
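The report step can be sketched as a filter followed by a sort. This is only an illustration under stated assumptions: the on-disk layout of Datawh.txt is not given in the text, so the records here are plain dictionaries keyed by the column names listed earlier, and build_report is a hypothetical name. It does not reproduce the memory-mapped search that SLIP uses.

```python
def build_report(records, atoms, order_by="record"):
    """Select the records whose d_port is one of the cluster's atoms.

    records: list of dicts with at least 'record', 'ename' and 'd_port'.
    atoms: the d_port values in the selected category.
    order_by: 'record' for log order, 'ename' for first-name order.
    """
    atom_set = set(atoms)
    selected = [r for r in records if r["d_port"] in atom_set]
    # Ties on the ordering column keep their original log order.
    return sorted(selected, key=lambda r: (r[order_by], r["record"]))

# Illustrative records only, not taken from the RealSecure log.
records = [
    {"record": 1, "ename": "EventY", "d_port": 443},
    {"record": 2, "ename": "EventX", "d_port": 80},
    {"record": 3, "ename": "EventZ", "d_port": 53},
    {"record": 4, "ename": "EventX", "d_port": 443},
]
report = build_report(records, atoms=[80, 443], order_by="ename")
for row in report:
    print(row["record"], row["ename"], row["d_port"])
```

Ordering by ename places all records with the same first name next to each other, which is the grouping that the text says reveals the incident-level structure.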
Clearly the first question is whether this retrieval is the right retrieval. There are two factors here. One is the order in which the records are shown, and the other is the examination of the non-specific relationship that defined the clustering.
Exercise:
Take the WinZip file called vSLIP and unzip it into an empty directory. Click on SLIP.exe and, when the browser comes up, type “extract” in the command line. When the hourglass goes away, click on the A1 node. Type “mag 10” in the command line. This will give you Figure 1a.
Figure 1 (a, b): First step of the exercise
Typing “c 100” (cluster for 100,000 iterations) will produce something like Figure 1b.
Selecting A1 gives the view in Figure 2a. Selecting B1 gives the view in Figure 2b.
Figure 2 (a, b): Drilling down into the data
In Figure 2a we have identified a cluster of 47 elements, at magnification 10. These are d_ports with a non-specific relationship defined by the fact that a single event name is associated in the original database with two or more d_ports. The returned report has 383 records; the first few can be seen in the Report window in Figure 3.
It is interesting to note that this cluster is diffuse and not narrowly defined. However, the reader can use the software to scatter-gather the atoms of this cluster, considered by itself, and see that all of the atoms immediately go to a single spike. In Figure 2b the magnification is set at 20. Type “help” to see the necessary command.
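The text does not specify the scatter-gather algorithm itself, so the following Python sketch is only a toy stand-in that shows the general idea of linked atoms collapsing to a single spike. It assumes that atoms sit at angles on a circle (0 to 360 degrees), that the scatter step randomizes the angles, and that each gather step pulls one randomly chosen pair together along the shorter arc. The function names, the step size and the atom names are all hypothetical.

```python
import random

def scatter_gather(atoms, pairs, iterations=2000, step=0.25, seed=0):
    """Toy scatter-gather on a circle measured in degrees.

    Scatter: every atom gets a random angle in [0, 360).
    Gather: a randomly chosen pair is pulled together along the shorter arc.
    Returns the starting angles and the final angles.
    """
    rng = random.Random(seed)
    angle = {a: rng.uniform(0.0, 360.0) for a in atoms}
    start = dict(angle)
    pair_list = list(pairs)
    for _ in range(iterations):
        a1, a2 = rng.choice(pair_list)
        # Signed shortest arc from a1 to a2, in [-180, 180).
        gap = (angle[a2] - angle[a1] + 180.0) % 360.0 - 180.0
        angle[a1] = (angle[a1] + step * gap) % 360.0
        angle[a2] = (angle[a2] - step * gap) % 360.0
    return start, angle

def arc(x, y):
    """Unsigned shortest arc between two angles, in degrees."""
    d = abs(x - y) % 360.0
    return min(d, 360.0 - d)

atoms = ["p1", "p2", "p3", "p4"]        # hypothetical atom names
pairs = [("p1", "p2"), ("p2", "p3")]    # p4 has no partner
start, final = scatter_gather(atoms, pairs)
print({a: round(final[a], 1) for a in atoms})
```

In this toy version the three chained atoms end up at nearly the same angle, while the unpaired atom stays where the scatter step left it.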
Figure 3: Category B1
Starting with Figure 1, one can click on Plot and use the “rnd” command to randomize the atoms.
Use the command “x”, where x is any number between 0 and 360, to create an indicator line (Figure 4a). Use the command “x, y”, where x and y are any numbers between 0 and 360, to create an indicator bracket. (One cannot go across 0 in the current version.) Use “x, y -> Tag”, where Tag is any short name, to create a new category (Figure 4b).
Figure 4 (a, b): The use of the indicator lines and brackets
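The three command forms can be illustrated with a small parser. This is a sketch of one reading of the syntax described above, not the Browser's own parser: the names parse_indicator and atoms_in_bracket are hypothetical, and the assumption that a bracket selects every atom whose angle lies between x and y is ours.

```python
import re

def parse_indicator(command):
    """Parse 'x', 'x, y' or 'x, y -> Tag' into a small tuple."""
    m = re.fullmatch(
        r"\s*(\d+(?:\.\d+)?)\s*(?:,\s*(\d+(?:\.\d+)?)\s*(?:->\s*(\w+)\s*)?)?",
        command,
    )
    if m is None:
        raise ValueError(f"unrecognized command: {command!r}")
    x = float(m.group(1))
    limits = (x,) if m.group(2) is None else (x, float(m.group(2)))
    if any(not 0 <= v <= 360 for v in limits):
        raise ValueError("angles must lie between 0 and 360")
    if len(limits) == 1:
        return ("line", x)
    y = limits[1]
    if x > y:
        # Mirrors the stated limitation: a bracket cannot go across 0.
        raise ValueError("a bracket cannot cross 0")
    if m.group(3) is None:
        return ("bracket", x, y)
    return ("category", x, y, m.group(3))

def atoms_in_bracket(angles, x, y):
    """Return the atoms whose angle lies inside the bracket [x, y]."""
    return sorted(a for a, theta in angles.items() if x <= theta <= y)

angles = {"p1": 85.0, "p2": 95.0, "p3": 200.0}   # hypothetical positions
kind, x, y, tag = parse_indicator("80, 100 -> C1")
print(kind, tag, atoms_in_bracket(angles, x, y))
```

Rejecting a bracket such as “350, 10” keeps the sketch consistent with the note that the current version cannot go across 0.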
Clustering (“c”) will show that B1, C1, C2, and C3 are each indeed a prime and that each element has at least one other element in the category that is related by the non-specific relationship. Randomizing A1 will allow one to reselect a cluster. By inspection, one can see that any cluster selected will be one of the four already identified. One can delete nodes by physically removing the folders in the Data folder and typing “load”. The Browser reads the folder structure, much as .NET and Java programs do, to find what is available for display.
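The property stated above, that each element has at least one related element in the same category, can be checked directly from the pair list. The sketch below also groups atoms that are connected through chains of pairs. Reading a “prime” as such a chain-connected group is our assumption, since the text does not define the term formally; the function names and sample atoms are hypothetical.

```python
def chain_components(atoms, pairs):
    """Group atoms into sets connected by chains of paired atoms (union-find)."""
    parent = {a: a for a in atoms}

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]   # path halving
            a = parent[a]
        return a

    for a1, a2 in pairs:
        if a1 in parent and a2 in parent:
            parent[find(a1)] = find(a2)

    groups = {}
    for a in atoms:
        groups.setdefault(find(a), set()).add(a)
    return list(groups.values())

def every_atom_has_partner(category, pairs):
    """True if each atom is paired with at least one other atom of the category."""
    category = set(category)
    partnered = set()
    for a1, a2 in pairs:
        if a1 != a2 and a1 in category and a2 in category:
            partnered.update((a1, a2))
    return partnered == category

atoms = [1, 2, 3, 4, 5]                 # hypothetical atoms
pairs = [(1, 2), (2, 3), (4, 5)]        # hypothetical pairs
print(sorted(sorted(g) for g in chain_components(atoms, pairs)))
print(every_atom_has_partner({1, 2, 3}, pairs))
```

Under this reading, a category that splits into more than one group would not cluster to a single spike.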
For B1 only, the data folders contain the report that is currently generated by the Browser and also two other text files, Report2.txt and Report3.txt, in the B3 folder. One can get the effect of ordering the report by the event name or by the d_port by renaming the files. The default ordering is the report number. The manipulation of the reports is still under development.
The files to view these structures (Figure 2 and Figure 3) are compressed into wSLIP. You may call Paul Prueitt at 703-981-2676 with questions. Additional investigation of clusters and reports can be made as practice. Comments about the interface should be sent to beadmaster@ontologyStream.com, to help in the continuing design.
One can take the residue, at the B level, by clicking on A1 and typing “residue”. The new subset (of 542 atoms) will cluster up very much as in Figure 5. These prime categories have all of the features of a single event as characterized by the chain relationships that are causing the clustering. The small size is just to get an example that is easy to look at. In this example this is not possible as the prime categories are easily identified. The current version does not generate reports if the prime has more than 50 atoms. As of November 14th, this has been fixed but is not yet released.
Figure 5: There are four prime events when viewed with this Analytic Conjecture.
The zipped file wSLIP will open up to the view of data as in Figures 5 and 6. However, the only report is for B1. The next exercise will demonstrate a functioning Report generator where the Report is an object and can be viewed in two different orderings (temporally and by firstname order, as in Figure 5). An incident-level event map will also be automatically constructed (see Figures 6 and 7).
The ordering of the Report by the Firstname (in this case ename) shows exactly why the entire category B1 is a prime. One can see this visually using the software, by clicking on B1, typing “rnd” to randomize and “c” to cluster.
Figure 6: The top and bottom of the Report ordered by the Analytic Conjecture
In Figure 7 we see three groups of event names. The three event names have become well specified by the link analysis (Figure 1).
Figure 7: The event map for the category B1
The ordering by the firstname (the b values) from the analytic conjecture produces a visual representation of the specific chaining relationships that are governed by single b values (an event name in this case). This suggests a natural way to specify an event map for each of the prime categories that are found through the clustering process.
Figure 8: A graphic depicting the construction and display of the event map
In Figure 6 we have mocked up how the event maps might look once we put the algorithms in place. All of the Stream_DoS RealSecure intrusion events (there are four of them) are linked together via two non-specific relationships
{ < 1169, r, 1157 >, < 1157, r, 1614 > }
as can be seen in Figure 5. We then have the relationship < 1614, r, x > with x having 31 distinct values. That links the Stream_DoS intrusion events to 131 TCP_Overlap_Data intrusion events. These intrusion events are then linked to the 6 Windows_Access_Error intrusion events via common use of the ports 1781 and 1853.
The Incident events are then:
{ < Stream_DoS, 1614, TCP_Overlap_Data >,
< TCP_Overlap_Data, 1781, Windows_Access_Error >,
< TCP_Overlap_Data, 1614, Windows_Access_Error > }
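One way to derive triples of this shape from a report is to link any two event names that occur on the same d_port. The Python sketch below does that. It is an illustration under our own assumptions, not the event map algorithm itself, which the text says is not yet in place. The function name incident_links is hypothetical, the sample rows are toy data that reuse names and ports quoted in the text, and the order of the two names inside a triple is simply alphabetical.

```python
from collections import defaultdict
from itertools import combinations

def incident_links(records):
    """Return triples (ename1, d_port, ename2) for event names sharing a d_port."""
    enames_by_port = defaultdict(set)
    for r in records:
        enames_by_port[r["d_port"]].add(r["ename"])
    links = set()
    for port, enames in enames_by_port.items():
        # Two different event names seen on one port form one link.
        for e1, e2 in combinations(sorted(enames), 2):
            links.add((e1, port, e2))
    return sorted(links)

# Toy rows that reuse names and ports quoted in the text; not the actual B1 report.
records = [
    {"ename": "Stream_DoS", "d_port": 1614},
    {"ename": "TCP_Overlap_Data", "d_port": 1614},
    {"ename": "TCP_Overlap_Data", "d_port": 1781},
    {"ename": "Windows_Access_Error", "d_port": 1781},
]
for link in incident_links(records):
    print(link)
```

This is the analytic conjecture applied in the other direction: the port plays the role of the shared value and the event names are the items being linked.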
Other aspects of the event corresponding to category B1 can be developed by looking at the other data such as common IP address usage.
These incident-level events are the primary work product that SLIP produces automatically.
It is seen that intrusion events are at one level of organization and that the incident events self-organize through link analysis and emergent computing.